Ontology Based Semantic Similarity Comparison of Documents

Authors

  • Vladimir A. Oleshchuk
  • Asle Pedersen
Abstract

In this work we consider ontologies as knowledge structures that specify terms, their properties, and the relations among them, to enable knowledge extraction from texts. We represent ontologies using a graph-based model that reflects the semantic relationships between concepts, and we apply them to text analysis and comparison. Instead of comparing raw documents, we compare document footprints enhanced with concepts from the ontology (using different enhancement algorithms). As a result, documents that were not similar before the enhancement may become similar (semantically, at some level of abstraction) after it, because the enhancement process may introduce abstract concepts from the ontology into the document footprint. Using the ontology, we can enhance the footprints by adding concepts that are not present in the original document. We may use synonyms for a horizontal expansion, broader terms/superclasses/types for a vertical expansion, or both.
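
The idea of footprint enhancement can be illustrated with a minimal Python sketch. It is not the authors' implementation: the toy ontology (`SYNONYMS`, `BROADER`), the function names, and the use of Jaccard overlap as the comparison measure are all assumptions made for illustration, since the abstract does not specify the enhancement algorithms or the similarity measure.

```python
# Toy ontology (hypothetical data, for illustration only).
# SYNONYMS supports horizontal expansion; BROADER supports vertical expansion.
SYNONYMS = {
    "car": {"automobile"},
    "automobile": {"car"},
}
BROADER = {
    "car": "vehicle",
    "automobile": "vehicle",
    "truck": "vehicle",
    "vehicle": "artifact",
}


def expand_horizontal(footprint):
    """Return the footprint plus the synonyms of each of its terms."""
    expanded = set(footprint)
    for term in footprint:
        expanded |= SYNONYMS.get(term, set())
    return expanded


def expand_vertical(footprint, levels=1):
    """Return the footprint plus broader concepts, up to `levels` steps up."""
    expanded = set(footprint)
    frontier = set(footprint)
    for _ in range(levels):
        frontier = {BROADER[t] for t in frontier if t in BROADER}
        expanded |= frontier
    return expanded


def jaccard(a, b):
    """Jaccard overlap of two term sets (an assumed comparison measure)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


doc_a = {"car"}
doc_b = {"truck"}

# The raw footprints share no terms.
raw_similarity = jaccard(doc_a, doc_b)

# After vertical expansion both footprints contain the abstract concept
# "vehicle", so the overlap is no longer empty.
enhanced_similarity = jaccard(expand_vertical(doc_a), expand_vertical(doc_b))
```

In this toy case the raw footprints have zero overlap, while the vertically enhanced footprints share the broader concept `vehicle`, which mirrors the abstract's claim that documents may become similar at some abstraction level only after enhancement. The `levels` parameter is one hypothetical way to control how abstract the added concepts are allowed to be.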

Similar resources

Aggregating Similarity Measures based Ontology on Documents Retrieval

This paper investigates a methodology for the ontology based semantic retrieval of annotated web documents with terms occurrence weighting. The semantic structural distance of document terms in terms of domain ontology is computed against new unknown queries to improve the documents ranking and retrieval. Furthermore, the role of aggregation methods to combine the weighting terms scheme based s...

An Efficient Cross Ontology-based Similarity Measure for Bio-document Retrieval System

In biomedical research, retrieving documents that match a query of interest is a frequent task. Typically, the set of results is extensive, contains many irrelevant documents, and is a flat list, i.e., not organized or indexed in any way. In this paper, we present an efficient biomedical document retrieval system with the proposed cross-ontology...

Development of a Combined System Based on Data Mining and Semantic Web for the Diagnosis of Autism

Introduction: Autism is a nervous system disorder, and since there is no direct diagnosis for it, data mining can help diagnose the disease. Ontology as a backbone of the semantic web, a knowledge database with shareability and reusability, can be a confirmation of the correctness of disease diagnosis systems. This study aimed to provide a system for diagnosing autistic children with a combinat...

A Hybrid Approach using Ontology Similarity and Fuzzy Logic for Semantic Question Answering

One of the challenges in information retrieval is providing accurate answers to a user’s question often expressed as uncertainty words. Most answers are based on a Syntactic approach rather than a Semantic analysis of the query. In this paper our objective is to present a hybrid approach for a Semantic question answering retrieval system using Ontology Similarity and Fuzzy logic. We use a Fuzzy...

Semantic Information Retrieval based on Wikipedia Taxonomy

Information retrieval is used to find a subset of relevant documents against a set of documents. Determining semantic similarity between two terms is a crucial problem in Web Mining for such applications as information retrieval systems and recommender systems. Semantic similarity refers to the sameness of two terms based on sameness of their meaning or their semantic contents. Recently many te...

Publication date: 2003